measures are presented in application oriented tables. Measures suitable (or
unsuitable)  for  repeated  measurements  are  identified  and  compared.  It  is  our 
opinion  that  the  30  measures  in  the  Recommended  category  should  be  given  first 
consideration  for  environmental  research  applications.  Further,  it  is 
recommended  that  information  pertaining  to  preexperimental  practice 
requirements  and  stabilized  reliabilities  should  be  utilized  in  repeated 
measures  environmental  studies. 

SUMMARY PAGE


PROBLEM 

The  goal  of  the  Performance  Evaluation  Tests  for  Environmental  Research 
(PETER)  Program  was  to  identify  a  set  of  measures  of  human  cognitive, 
perceptual,  and  motor  capabilities  for  use  in  the  study  of  environmental  and 
other  time-course  effects.  Tasks  were  evaluated  as  suitable  for  repeated 
measures  applications  when  their  intertrial  means,  variances,  and  correlations 
were  well-behaved  under  constant  baseline  conditions.  The  results  of  this 
program  are  documented  in  more  than  90  reports.  Because  of  the  volume  of  this 
literature,  a  review  is  needed  to  enhance  the  applicability  of  the  results. 

FINDINGS 

This  report  provides  an  evaluation  of  112  measures  studied  in  the  PETER 
Program.  They  are  categorized  into  four  groups  based  upon  consideration  of 
task  stability  and  task  definition.  The  Recommended  category  contained  30 
measures  that  clearly  obtained  total  stabilization  and  had  an  acceptable  level 
of reliability efficiency (i.e., rxx ≥ .50, when normalized to a three-minute
administration).  The  Acceptable-But-Redundant  category  contained  15  measures 
that  met  the  same  requirements  as  the  Recommended,  but  were  found  redundant. 
The  35  measures  in  the  Marginal  category  usually  had  desirable  features  which 
were  outweighed  by  faults.  The  32  measures  in  the  Unacceptable  category  were 
characterized  by  either  differential  instability  or  weak  reliability 
efficiency  (rxx  <  .15).  This  category  contained  an  inordinate  number  of  slope 
and  other  derived  measures.  Characteristics  of  the  measures  are  presented  in 
application  oriented  tables.  Measures  suitable  (or  unsuitable)  for  repeated 
measurements  are  identified  and  compared. 

RECOMMENDATIONS 

It  is  our  opinion  that  the  30  measures  in  the  Recommended  category  should 
be  given  first  consideration  for  environmental  research  applications. 

Further,  it  is  recommended  that  information  pertaining  to  preexperimental 
practice  requirements  and  stabilized  reliabilities  should  be  utilized  in 
repeated  measures  environmental  studies. 

PERFORMANCE EVALUATION TESTS FOR ENVIRONMENTAL RESEARCH (PETER):

EVALUATION  OF  112  MEASURES 


Performance  Evaluation  Tests  for  Environmental  Research  (PETER),  a 
program  to  evaluate  the  suitability  of  human  performance  tests  for  repeated 
measures applications, has been underway since 1977 (50, 62). The goal of
this program was to identify a set of measures of human cognitive, perceptual,
and motor capabilities for use in the study of environmental and other
time-course effects. Environmental stressors, for example, those experienced
in  Navy  workplaces  such  as  aboard  ship,  may  reduce  well-being  and 
productivity.  The  gross  effects  of  such  arduous  environments  are  readily 
observable,  but  in  order  to  detect  more  subtle  effects,  a  sensitive  measuring 
device  is  necessary.  The  PETER  Battery  has  been  designed  to  be  sensitive  to 
changes  in  performance  and  for  other  repeated  measures  applications. 

Prior  to  the  advent  of  the  PETER  Program,  concerted  efforts  at  research 
on  the  differential  effects  of  practice  on  test  characteristics  had  not 
appeared  with  any  regularity  in  the  recent  literature  (37,  59,  62).  Yet  it  is 
only  with  such  a  paradigm  that  subtle  changes  in  performance  can  be  most 
efficiently  detected  (110).  In  previous  battery  development,  attention  was 
paid  to  the  stability  of  the  means,  and  to  a  lesser  extent  to  the  stability  of 
the  standard  deviations  or  variances.  The  PETER  Program  focused  also  on  the 
stability  and  reliability  of  the  intertrial  correlations  (62). 

Tasks  were  evaluated  as  suitable  for  inclusion  in  the  battery  when  their 
intertrial  means,  variances  and  correlations  were  well-behaved  under  constant 
baseline  conditions  (62).  The  tests  were  drawn  from  environmental, 
information  processing,  neuropsychological,  and  microcomputer  task  batteries 
(64,  65).  More  than  140  performance  measures  were  evaluated  and  documented  in 
90  reports  (50).  Because  of  the  volume  of  this  literature,  a  review  focused 
on  the  utility  of  tasks  is  needed  to  enhance  applicability.  This  report 
provides  a  synoptic  evaluation  of  the  human  performance  measures  studied  as 
part  of  the  PETER  Program. 

Repeated  Measures  Applications 


There are many situations in which it is useful to measure human
performance capabilities repeatedly. These include following the time-course of
performance  in  studies  of  vigilance,  maturation,  or  environmental  stress  (75), 
and  monitoring  recovery  from  an  injury  (80).  In  addition,  repeated  measures 
are  useful  in  evaluating  the  effectiveness  of  training  (45)  and  in  comparing 
the  effects  of  various  equipment  configurations  on  human  performance  (88). 

The  application  of  repeated  measures  spans  the  breadth  of  human  performance 
experimentation. 

Repeated  measures  experimentation  is  frequently  favored  in  applied 
situations  because  it  can  be  more  efficient  and  economical  than  alternate 
approaches  (110).  When  intertrial  correlations  are  constant  (i.e., 
differentially stable), the power of repeated measures analysis-of-variance
increases with the magnitude of the correlations and considerable economy is
realized (101). When two sets of measures have constant correlations, the
power of differential (correlational) analyses may also be substantially
increased by the use of correlated averages (35, 90) or, more potently, by
averaged correlations (11, 14, 32). However, economical use of subjects may
provide  the  paramount  rationale  for  repeated  measures.  This  is  true  when 
there  is  a  scarcity  of  qualified  subjects  or,  more  importantly,  when  there  are 
hazards  associated  with  the  experimentation  (15,25).  Repeated  measures 
designs  permit  the  use  of  fewer  subjects,  but  in  addition,  they  minimize  the 
total  exposure  time.  Clearly,  it  is  important  to  consider  the  task 
characteristics  required  for  repeated  measures  applications  before  conducting 
research. 
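
The economy argument can be illustrated with a small calculation. The sketch below is not taken from the PETER reports. It assumes two trials with equal variance `sigma2` and intertrial correlation `rho` (both hypothetical names) and uses the standard identity Var(X1 - X2) = 2 * sigma2 * (1 - rho) to show how the error variance of a within-subject comparison shrinks as the intertrial correlation rises.

```python
def difference_variance(sigma2: float, rho: float) -> float:
    """Variance of the difference between two equally variable trials.

    Assumes both trials have variance sigma2 and correlate rho with each other.
    """
    return 2.0 * sigma2 * (1.0 - rho)


def relative_efficiency(rho: float) -> float:
    """Ratio of the uncorrelated (rho = 0) difference variance to the
    correlated one; larger values mean a more economical design.
    Requires rho < 1."""
    return difference_variance(1.0, 0.0) / difference_variance(1.0, rho)
```

Under these assumptions, an intertrial correlation of .50 halves the error variance of the comparison relative to uncorrelated measurements, which is one way to see why stable, highly correlated trials permit fewer subjects.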

Criteria  for  Repeated  Measures 


Repeated  measurements  must  possess  certain  characteristics  in  order  to  be 
meaningful,  and  to  be  easily  and  clearly  interpretable  (3,56,77).  First,  the 
measurements  must  represent  a  constant  mixture  of  human  performance 
capabilities  on  each  trial  of  repeated  measurement.  In  its  simplest  form, 
this  requirement  implies  that  the  relative  differences  between  subjects  on  the 
capability  being  measured  remain  constant  across  all  trials  of  repeated 
measurement.  This  requirement  for  meaningful  repeated  measurements  can  be  met 
objectively  by  showing  that,  apart  from  measurement  errors,  intertrial 
correlations  are  unchanging  (differentially  stable)  and  variances  are 
homogeneous across baseline repetitions (9,57,77). Differential stability, in
this  context,  provides  assurance  that  the  entity  which  is  being  measured  is 
remaining  constant  (2).  Stated  technically,  differential  stability  and 
constant  variances  make  up  the  compound  symmetry  requirement  of  the 
variance-covariance  matrix  for  simple  repeated  measures  analysis  of  variance 
(110).  Together,  differential  and  variance  stability  are  required  for 
simplified  analysis  and  interpretation. 
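
A descriptive check of the compound symmetry requirement can be sketched as follows. This is an illustration only, not the procedure used in the PETER studies: it computes the trial-by-trial sample covariance matrix from a subjects-by-trials score table and reports how far the variances and the covariances each depart from being equal. The function names and the data layout (`scores[s][t]`) are assumptions made for the example, and a formal significance test would be needed in practice.

```python
from statistics import mean


def covariance_matrix(scores):
    """Sample covariance matrix of trials.

    scores[s][t] is the score of subject s on trial t.
    """
    n_subjects = len(scores)
    n_trials = len(scores[0])
    trial_means = [mean(row[t] for row in scores) for t in range(n_trials)]
    return [
        [
            sum((row[i] - trial_means[i]) * (row[j] - trial_means[j]) for row in scores)
            / (n_subjects - 1)
            for j in range(n_trials)
        ]
        for i in range(n_trials)
    ]


def compound_symmetry_spread(cov):
    """Return (range of variances, range of covariances).

    Both ranges are zero when the matrix is exactly compound symmetric.
    Requires at least two trials.
    """
    k = len(cov)
    variances = [cov[i][i] for i in range(k)]
    covariances = [cov[i][j] for i in range(k) for j in range(i + 1, k)]
    return max(variances) - min(variances), max(covariances) - min(covariances)
```

Large spreads on either component suggest that variances are not homogeneous or that intertrial covariances are still changing across trials.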

The  second  requirement  for  meaningful  and  interpretable  repeated 
measurements  is  that  practice  effects  must  be  nil  or  predictable.  In  this 
regard,  Lord  and  Novick  (77)  point  out  that  repeated  measurements  may  be 
useful  if  mean  scores  change  by  an  additive  constant  from  one  trial  to 
another.  Campbell  and  Stanley  (17),  in  their  classic  discussion,  illustrate 
the  principle  that  the  additive  constant  should  be  the  same  from  one  trial  to 
the  next;  the  cumulative  effect  should  have  no  more  than  a  linear  trend 
(preferably  with  near  zero  slope).  Campbell  and  Stanley  have  also  noted  that 
nonlinear  changes  across  repeated  measurements  impede  or  make  impossible 
interpretation  of  effects  of  experimental  interventions. 
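
The additive-constant requirement can be checked informally from the trial means. In the sketch below (hypothetical helper names, illustrative numbers), a constant trial-to-trial increment corresponds to a linear trend, so the second differences of the means should be near zero. It is a descriptive aid under these assumptions, not a replacement for a trend test.

```python
def trial_increments(means):
    """Change in mean score from each trial to the next."""
    return [later - earlier for earlier, later in zip(means, means[1:])]


def second_differences(means):
    """Change in the increments; all zeros indicates a purely linear trend."""
    increments = trial_increments(means)
    return [later - earlier for earlier, later in zip(increments, increments[1:])]
```

For example, means of 10, 12, 14, 16 give second differences of zero (linear), whereas means of 10, 14, 16, 17 give negative second differences, the signature of a decelerating practice curve.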

In  sum,  the  statistical  requirements  for  easily  interpretable  results  of 
repeated  measures  include  level  or  linearly  increasing  means,  level  variances, 
and  differential  stability. 

PETER  Paradigm 


The  PETER  Program  has  focused  largely  upon  determining  when,  if  ever, 
practiced  capability  measures  meet  the  criteria  for  repeated  measures 
applications.  In  the  typical  evaluation  procedure,  a  moderate  number  (10-25) 
of  subjects  were  assessed  daily  for  15  days  under  baseline  conditions  at  the 
same  time  of  day.  Also,  massed  practice  effects  were  investigated  in  more 
abbreviated  (3-  to  10-day)  studies  in  which  multiple  trials  were  given  within 
a  day  (71,74). 

A sequential strategy was employed in all studies to assess when means,
variances, and intertrial correlations became stable (12,18). For the most
part, this strategy involved dropping leading trials (these were usually daily
scores) until an appropriate test statistic was conservatively nonsignificant
(p > 0.1). For massed trials within a day, the procedure was altered, on a
case-by-case basis, to focus on trials not affected by massing effects (e.g.,
first  trials  across  days).  In  sum,  the  PETER  paradigm  was  aimed  at 
determining  when,  if  ever,  practiced  tasks  obtained  mean,  variance,  and 
differential  stability. 
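
The sequential strategy described above can be sketched as a simple loop. The code is a minimal illustration, assuming the caller supplies a function `p_value` that returns the significance level of whatever stability test is appropriate for the remaining trials; that function, its name, and the `toy_p_value` stand-in are hypothetical and are not the statistics used in the PETER reports.

```python
def first_stable_trial(trials, p_value, alpha=0.10):
    """Drop leading trials until the stability test is nonsignificant.

    Returns the index of the first retained trial, or None if no
    trailing run of at least two trials passes (p > alpha).
    """
    for start in range(len(trials) - 1):
        if p_value(trials[start:]) > alpha:
            return start
    return None


def toy_p_value(remaining):
    """Placeholder only: NOT a statistical test.

    Pretends the remaining trial means are 'stable' when their range is small.
    """
    return 0.50 if max(remaining) - min(remaining) <= 1.0 else 0.01
```

With illustrative trial means of 10, 14, 16, 17, 17, 17.5 and the placeholder above, the first three trials are dropped and stability is declared from the fourth trial onward.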

Subjects were U.S. Navy enlisted men, ages 18-28, who had volunteered
for  assignment  to  this  laboratory  as  full-time  research  participants  under 
provisions  of  informed  consent  (SECNAVINST  3900.39  Series  and  NAVMEDCOMINST 
3900.5  Series).  Subjects  were  selected  for  physical  and  other  characteristics 
to  participate  in  biodynamic  research.  They  were  intellectually  typical  of 
enlisted  personnel  (102). 

Purpose 

The purpose of this report is to provide an applications-oriented review
of the performance measures evaluated as part of the PETER Program. Results
for 112 measures are classified for their potential utility for the
practitioner. Discussion covers the application of the results and implications
for past and future research.

METHOD 

A  survey  was  conducted  and  salient  features  were  extracted  from  tasks 
studied  in  the  PETER  Program.  Measures  were  categorized  into  four  classes, 
depending  upon  their  utility  for  repeated  measures  applications:  Recommended, 
Acceptable-But-Redundant,  Marginal,  and  Unacceptable. 

Survey  of  Performance  Measures 

More  than  140  performance  measures  were  identified  initially  from 
documents  listed  in  a  recent  bibliography  of  the  PETER  Program  (50).  Many 
tasks  were  excluded  from  consideration  as  they  had  been  eliminated  in  the 
early  stages  of  analysis,  or  were  still  at  an  early  developmental  level.  The 
poor  reliabilities  and  stabilities  of  difference,  proportion,  slope,  and  other 
derived  measures  eliminated  many  of  them  from  consideration  for  repeated 
measures  applications  and  discouraged  complete  documentation  (10,22).  Some 
computer  mechanized  tasks  were  not  considered  because  they  still  required 
substantial  development.  These  tasks  frequently  had  less  reliability  than 
their  paper-and-pencil  counterparts,  or  had  questionable  construct  validity 
(71,74,96).  Other  computer  tasks  which  appear  to  have  desirable  metric 
qualities  have  been  developed  but  are  not  in  a  sufficiently  advanced  stage  to 
be  included  in  this  review  (16).  Overall,  a  total  of  112  of  the  original  140 
performance  measures  were  finally  judged  adequate  for  complete  reporting  of 
the  critical  elements  outlined  in  Table  1. 

Mean,  variance,  and  differential  stability  results  for  the  112  selected 
measures  were  evaluated  for  comparability  before  the  features  were  extracted. 
This  was  necessary  because  statistical  and  interpretive  methodology  had 
evolved  over  the  seven  years  of  the  PETER  Program  (12,13).  Evaluations  of 
differential stability, for example, were conducted by a half dozen approaches
ranging  from  analysis  based  on  graphical  approaches  (9)  to  analysis  based  on 
the  work  of  Steiger  (97,98).  Where  analyses  were  not  comparable,  data  were 
reanalyzed  by  appropriate  techniques  (12).  This  was  required,  for  example, 
where  factor  analysis  was  the  method  for  establishing  differential  stability 
(58).  Hence,  the  stability  results  were  made  comparable  for  the  112  measures 
before  salient  features  were  extracted. 

Categorization  of  Measures 


In  the  second  stage  of  the  investigation  the  112  measures  were 
categorized  into  the  four  groups:  Recommended,  Acceptable-But-Redundant, 
Marginal,  and  Unacceptable.  This  categorization  was  based  upon  joint 
consideration  of  task  stability  and  task  definition.  This  classification  was 
designed  as  a  guide  for  the  selection  of  tasks  for  environmental  and  other 
repeated  measures  studies. 

Recommended. Measures in this category were those that clearly obtained
total stabilization and had an acceptable level of reliability efficiency
(i.e., rxx ≥ .50, when normalized to a three-minute administration). This
level of reliability was required for categorization as Recommended based upon
earlier considerations of the statistical power of repeated measures designs
(12).
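
One common way to put reliabilities from tests of different lengths on a common footing is the Spearman-Brown formula. The sketch below assumes, without confirmation from the text above, that a normalization of this kind underlies the three-minute reliability efficiency; the function names are hypothetical.

```python
def spearman_brown(rxx: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by length_factor."""
    return length_factor * rxx / (1.0 + (length_factor - 1.0) * rxx)


def normalize_to_three_minutes(rxx: float, admin_minutes: float) -> float:
    """Reliability projected to a three-minute administration.

    Assumes Spearman-Brown scaling applies (parallel, equally reliable segments).
    """
    return spearman_brown(rxx, 3.0 / admin_minutes)
```

Under that assumption, a reliability of .90 obtained from a 10-minute administration projects to roughly .73 for three minutes, while a reliability of .50 from a 1-minute administration projects to .75.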


Acceptable-But-Redundant.  These  measures  had  met  the  same  requirements 
as  those  in  the  Recommended  category,  but  had  been  found  redundant  by  factor 
analysis  or  related  studies  of  stabilized  tasks.  In  addition  to  being 
redundant,  these  measures  generally  had  slightly  less  reliability  than  their 
counterparts  classified  as  Recommended. 

Marginal. Marginal measures were distinguished by either instability of
means or variances throughout practice, questionable differential stability,
or less than a modicum of reliability efficiency (.15 ≤ rxx < .50). These
measures usually had desirable features which were outweighed by faults.

Unacceptable. Measures in this category were characterized by either
differential instability or weak reliability efficiency (rxx < .15). This
category contained an inordinate number of slope, difference, proportion, and
other derived measures.
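
The four category definitions can be summarized as a decision rule. The sketch below is a simplification for illustration: the actual classification rested on joint consideration of task stability and task definition, and the argument names and the three-level coding of differential stability are assumptions introduced here.

```python
def categorize(mean_var_stable: bool, differential: str, rxx: float,
               redundant: bool = False) -> str:
    """Assign a measure to one of the four categories.

    differential is one of "stable", "questionable", or "unstable";
    rxx is the reliability efficiency for a three-minute administration.
    """
    if differential == "unstable" or rxx < 0.15:
        return "Unacceptable"
    if not mean_var_stable or differential == "questionable" or rxx < 0.50:
        return "Marginal"
    return "Acceptable-But-Redundant" if redundant else "Recommended"
```

For instance, a totally stable measure with a reliability efficiency of .83 is Recommended unless it has been found redundant, whereas the same measure with a reliability efficiency of .40 would fall in the Marginal category.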


RESULTS 


The  tasks  are  categorized  as  Recommended,  Acceptable-But-Redundant, 
Marginal,  or  Unacceptable  in  Tables  2  through  5.  Definitions  of  the  task 
features  listed  in  the  table  headings  are  given  in  Table  1. 

Recommended and Acceptable-But-Redundant


The  Recommended  and  Acceptable-But-Redundant  measures  are  summarized  in 
Tables  2  and  3,  respectively.  Table  2  is  made  up  of  30  measures  of  cognitive 
(17), perceptual (11), and motor (7) performance. (Note that Contrast
Sensitivity constitutes five measures.) Table 3, which contains the
Acceptable-But-Redundant measures, is made up of 15 measures which are primarily
cognitive  and  perceptual.  The  scarcity  of  motor  measures  reflects  an  emphasis 
on factor-analytic and related differential studies of cognitive and
perceptual  tasks  during  the  PETER  Program.  The  Recommended  and  Acceptable- 
But-Redundant  categories  contain  a  wide  range  of  tests  of  individual 
capabilities  which  we  consider  suitable  for  repeated  measures  research. 

Marginal


Table  4  summarizes  35  measures  which  had  one  or  more  undesirable  features 
and,  therefore,  could  not  be  designated  as  totally  suitable  for  repeated 
measures applications. Cognitive components are present in 20 measures.
Major perceptual components are present in 27 measures, including 10 for
Contrast Sensitivity. Fourteen measures, including 7 microprocessor-based
games, have major motor components. Of the 35 measures, 4 are slope
or difference scores. Flaws have been found in a broad range of performance
measures.

Some of these tests could be of limited use in their present form. For
example, otherwise flawed measures which became differentially stable with
high reliability efficiencies might be employed in purely differential
correlational studies in which changes in the means and variances were of less
interest. Other measures which may obtain total stability but have weak
reliability efficiencies (rxx < .50) might be considered for application if
there were no other measure of that capability available. Extensive
repetitions (more trials) would be required to ensure power in cases where
reliabilities are weak. However, before use of these measures, consideration
should be given to task or scoring changes which could eliminate the
undesirable features. Overall, while these Marginal tasks have some potential
for application, first consideration should be given to making them suitable.
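
How many repetitions a weakly reliable measure would need can be estimated by inverting the Spearman-Brown formula. This is a rough planning sketch rather than a procedure from the PETER reports; it assumes the repeated trials behave as parallel measurements that are averaged, and the function name is hypothetical.

```python
def lengthening_factor(rxx: float, target: float) -> float:
    """Factor by which testing must be lengthened (e.g., trials averaged)
    to raise reliability from rxx to target, per the Spearman-Brown formula."""
    return target * (1.0 - rxx) / (rxx * (1.0 - target))
```

Under those assumptions, a measure with a reliability of .30 would need about nine times as much testing to reach .80, which illustrates why weak reliability efficiencies are costly.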

Unacceptable 

Table  5  lists  32  measures  found  unsuitable  for  repeated  measures 
applications  in  their  present  form.  Thirteen  of  these  measures  have  primarily 
cognitive  components.  Of  the  17  measures  having  major  perceptual  components, 
10  measures  are  summarized  under  the  two  entries  for  visual  contrast 
sensitivity.  The  remaining  four  measures  have  major  motor  components.  Ratio, 
slope,  intercept,  difference,  and  various  derived  scores  make  up  11  of  the  32 
measures  categorized  as  Unacceptable. 

DISCUSSION 


The  stability  of  112  performance  measures  administered  repeatedly  under 
baseline conditions was reviewed. It was found that, although largely drawn
from performance batteries, only 45 measures could be judged as Recommended or
Acceptable-But-Redundant. Thus only about 40% of the well-practiced measures
demonstrated total (mean, variance, and differential) stability. These and
related  findings  provide  a  basis  for  the  selection  of  tasks  and  pretest 
stabilization  periods  and  will  be  discussed  in  this  section.  Methods  of 
scoring,  implications  for  the  current  environmental  effects  literature,  and 
other  findings  will  also  be  discussed. 

Test Selection and Use


The  results  of  the  present  review  provide  guidance  for  performance  test 
selection  and  utilization.  Table  2  delineates  a  range  of  30  perceptual, 
cognitive,  and  motor  measures  which  should  be  considered  for  repeated  measures 
applications. Tables 3 to 5 outline 82 measures which cannot be recommended.
In particular, Table 3 lists measures found suitable for repeated measures
applications, but redundant with those in Table 2. Table 4 lists measures of
questionable  utility  in  their  present  format  which  should  be  considered  for 
application  only  when  no  comparable  measure  can  be  found  in  the  Recommended  or 
Acceptable-But-Redundant  categories.  Substantial  task  development,  to 
eliminate  flaws,  is  recommended  for  measures  in  this  category  before  their 
use.  Table  5  lists  measures  found  unsuitable  for  repeated  measures  use  in 
their  present  format.  In  sum,  measures  suitable  and  unsuitable  for  repeated 
measures  applications  are  identified  in  Tables  2  through  5.  The  researcher 
may  consult  these  tables  to  determine  the  utility  of  a  particular  measure  or 
the  likely  stability  of  a  related  one. 

Table 2 provides selection and utilization information in addition to
being an aggregation of fully suitable measures. Factor and domain
information, in particular, may be used to identify subsets of measures for a
particular application. For example, Guignard, Bittner, and Carter (47) used
such an approach to identify five perceptual, cognitive, and motor measures
for use in an investigation of whole-body vibration. Reliability efficiency
data may be employed to select sensitive tasks from measure subsets. High
reliability efficiencies provide for statistical power (20,101). For example,
the approach of Guignard et al. (47) has been used to select a mini-battery
for environmental applications. Table 6 characterizes this battery, which
contains tasks designed to assess left and right hemisphere functions, as well
as fine perceptual motor and arm movement speed. The mini-battery assesses
five measures with reliabilities above .85 in less than 10 minutes.

Prior  to  task  selection,  total  stabilization  time  may  be  used  in  planning 
the  amount  of  experimental  practice  time.  Guignard  et  al.  (47)  used 
stabilization  time  information  in  planning  their  study.  Anticipating  the 
effects  of  massed  practice  on  stabilization,  Krause  and  Woldstad  (74)  allowed 
more  practice  than  the  minimum  required  for  distributed  practice.  Altogether, 
the  factor,  domain,  reliability,  and  stabilization  information  are  an  aid  for 
selection  and  utilization  of  experimental  tasks. 

Scoring  Methods 


Analysis of the 112 measures indicated that derived scores frequently
have undesirable properties (10,22,52). Specifically, none of the 15
difference, slope, or proportion scores appears in either the Recommended
or Acceptable-But-Redundant category, while 45 of the 97 nonderived scores are
classified in these categories (χ²(1) = 9.47; p < .005). This association
is an underestimate, because across all 140 measures derived scores made up a
disproportionate number of those dropped early from consideration because of
poor statistical characteristics. Overall, derived scores are associated with
ratings of Marginal or Unacceptable.
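
The association between score type and category can be examined with a 2 x 2 chi-square computation on the counts given above (0 of 15 derived scores versus 45 of 97 nonderived scores in the two suitable categories). The sketch below is illustrative: the function name is hypothetical, and because the original computation is not described, neither the corrected nor the uncorrected value is claimed to reproduce the reported statistic exactly.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = False) -> float:
    """Pearson chi-square (1 df) for the table [[a, b], [c, d]].

    With yates=True, applies the continuity correction.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2.0, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: derived, nonderived. Columns: suitable, not suitable.
uncorrected = chi_square_2x2(0, 15, 45, 52)
corrected = chi_square_2x2(0, 15, 45, 52, yates=True)
```

Both variants exceed 7.88, the critical value of chi-square with one degree of freedom at the .005 level, so the qualitative conclusion does not depend on the choice of correction.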

During the present study, a combination of analytic and empirical
evidence was uncovered which questions the use of difference-related scores.
Specifically, this report supports the analytic results of Cronbach and Furby
(28), who found that difference scores tend to be unreliable and of
questionable utility. For example, Harbeson, Krause, Kennedy, and Bittner
(52) found that the Stroop interference score possessed low reliability.
Moreover, this score was found to reflect a difference between two variables
of virtually identical factor composition. Paralleling Cronbach and Furby,
Carter and Krause (22) have demonstrated analytically that slope scores have
properties similar to, if not isomorphic with, difference scores. In
addition, they reported empirical slope score results which tended to exhibit
low reliability and differential instability over a series of information
processing tasks such as Short-Term Memory Scanning (99,100) and Letter
Search (84). Similarly, Bittner (10) has demonstrated analytically the
potential for undesirable properties with proportion-of-baseline and other
ratios of random variables which are also difference-related. These
properties were seen in earlier research (83,95) and indicate that
results using proportion of baseline may often be artifactual (10). The present
review suggests that the use of difference, slope, and proportional scores
should be questioned.
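
The unreliability of difference scores follows from a standard result in classical test theory. The sketch below uses the textbook formula for the reliability of a difference between two measures with equal variances; it is offered as an illustration of the general argument, not as the specific derivation in the cited reports, and the function and argument names are hypothetical.

```python
def difference_score_reliability(rxx: float, ryy: float, rxy: float) -> float:
    """Reliability of the difference X - Y under classical test theory.

    Assumes X and Y have equal variances; rxx and ryy are their
    reliabilities and rxy is the correlation between them (rxy < 1).
    """
    return ((rxx + ryy) / 2.0 - rxy) / (1.0 - rxy)
```

Under these assumptions, two components with reliabilities of .90 that correlate .80 yield a difference score with a reliability of only .50, and the value falls further as the components share more of their factor composition.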

The  frequently  undesirable  properties  of  difference-related  scores 
suggest  a  cautious  empirical  examination  before  they  are  used.  Examination  of 
theoretical  models  for  individual  subject  derived  scores  may  be  recommended  as 
a  first  step  (10,22).  As  a  second  step,  the  methods  for  stability  analysis 
described  earlier  (12)  are  also  recommended  after  selection  of  an  appropriate 
model. Evaluation of the stability of difference-related scores is
recommended to ensure meaningful experimentation.

Implications  for  the  Environmental  Literature 


The finding that only 40% of the well-practiced tasks demonstrated total
stability across repeated measurements brings into question the validity of
that part of the performance literature based on repeated measures. Failure
to meet the assumptions of total stability may be catastrophic. Nonlinear
changes in means may render interpretation of intervention effects difficult,
if  not  impossible  (17).  In  addition,  a  failure  to  obtain  joint  variance  and 
differential  stability  implies  seriously  distorted  statistical  tests  for 
effects  and,  consequently,  misleading  evidence  as  to  the  presence  of  such 
effects  (91,110).  Examining  mean  and  variance  stability,  graphically  and 
otherwise,  is  a  good  first  step  before  initiating  investigations. 
Unfortunately,  stability  of  means  and  variances  does  not  imply  differential 
stability  (80).  Examining  only  means  and  variances  may  result  in  failure  to 
identify  changes  in  the  nature  of  the  construct  which  is  being  measured; 
differential  changes  make  the  meaningful  interpretation  of  the  results 
virtually  impossible.  Failure  to  attend  to  task  stability  may  be  a  source  of 
the difficulties in meta-analyses of the environmental literature (40). It is
concluded  that  the  validity  of  much  of  the  environmental  research  literature 
could  be  questioned  on  the  grounds  of  possible  instability  of  repeated 
measures. 

The  results  of  this  review  also  support  the  validity  of  part  of  the 
environmental  literature.  Many  investigators  have  used  one  of  the  measures 
identified in the Recommended or Acceptable-But-Redundant categories in the
present review and have practiced subjects sufficiently to have obtained
stability. Baddeley's (4) Grammatical Reasoning test, for example, has been

TABLE 3: ACCEPTABLE-BUT-REDUNDANT

| NAME | FACTOR | DOMAIN | ADMIN TIME (MIN) | TYPE ADMIN | TOT STAB TIME IN MINUTES (DIFF) | RELIAB EFFIC 3 MIN | REFERENCES |
|---|---|---|---|---|---|---|---|
| ARITHMETIC COMPUTATION | NUMBER FACILITY (N) (EKSTROM ET AL., 1976) | C | 10 | G | 90(<10) | 0.83 | SEALES ET AL. (1980) |
| ARITHMETIC: NUMBER FACILITY | NUMBER FACILITY (N) (EKSTROM ET AL., 1976) | C | 3 | G | 27(27) | 0.83 | BITTNER ET AL. (1983); MORAN & MEFFERD (1959) |
| CHOICE REACTION TIME: 2-CHOICE | CHOICE REACTION TIME (DONDERS, 1868) | P | 5.0 | I | 35(35) | 0.51 | KRAUSE & BITTNER (1982); TEICHNER & KREBS (1974) |
| GRAPHEMIC AND PHONEMIC ANALYSIS: SENSE/HOMOPHONE | VISUAL OR GRAPHEMIC ENCODING (BARON & MCKILLOP, 1975) | C | 8 | G | 40(40) | 0.66 | HARBESON, KENNEDY, ET AL. (1982); BARON & MCKILLOP (1975); ROSE & FERNANDES (1977) |
| GRAPHEMIC AND PHONEMIC ANALYSIS: HOMOPHONE/NONSENSE | ACOUSTIC OR PHONEMIC ENCODING (BARON & MCKILLOP, 1975) | C | 8 | G | 72(72) | 0.73 | HARBESON, KENNEDY, ET AL. (1982); BARON & MCKILLOP (1975); ROSE & FERNANDES (1977) |
| LETTER CLASSIFICATION: PHYSICAL MATCH | PATTERN MATCHING (POSNER & MITCHELL, 1967) | P | 12 | G | 108(108) | 0.52 | HARBESON, KENNEDY, ET AL. (1982); POSNER & MITCHELL (1967); ROSE & FERNANDES (1977) |
| LETTER SEARCH: TIME PER CORR. ITEM | VISUAL SEARCH (NEISSER ET AL., 1963) | P | 3 | G | 27(27) | 0.87 | CARTER & KRAUSE (1983); CARTER & SBISA (1982); SHANNON ET AL. (IN PRESS) |
| MINNESOTA RATE OF MANIPULATION: PLACING | MANUAL DEXTERITY (FLEISHMAN & ELLISON, 1962) | M | 3-5 | I | 42(42) | 0.61 | CARTER, STONE, & BITTNER (1982); SCHOENFELDT (1972) |
| NUMBER COMPARISON | PERCEPTUAL SPEED (P) (EKSTROM ET AL., 1976) | P | 3 | G | 27(9) | 0.84 | BITTNER ET AL. (1983); CARTER & SBISA (1982) |
| PATTERN RECOGNITION: TIME PER CORRECT ITEM | PATTERN RECOGNITION (FITTS, WEINSTEIN, RAPPAPORT, ET AL., 1956) | P | 2 | G | 20(20) | 0.76 | CARTER & SBISA (1982); CARTER & KRAUSE (1983) |
| PURDUE PEGBOARD | FINE FINGER DEXTERITY (TIFFIN, 1968) | P, M | 2 | I | 42(42) | 0.90 | KRAUSE & WOLDSTAD (1983); TIFFIN (1968) |
| RANDOM FIELD NUMBER SEARCH: TIME PER CORRECT ITEM | VISUAL SEARCH | P | 5 | G | 35(35) | 0.55 | SHANNON ET AL. (IN PRESS); CARTER & SBISA (1982) |
| SPEED OF CLOSURE | CLOSURE, VERBAL (CV) (EKSTROM ET AL., 1976) | P | 2.5 | G | 28(25) | 0.80 | BITTNER ET AL. (1983); MORAN & MEFFERD (1959) |
| STROOP: BLACK & WHITE WORDS (BW) | PERCEPTUAL SPEED (JENSEN & ROHWER, 1966) | P | 0.5 | G | 1.5(1.5) | 0.96 | HARBESON, KRAUSE, ET AL. (1982) |
| STROOP: COLOR BLOCKS (CB) | MIXED | P | 0.5 | G | 3.5(3.5) | 0.98 | HARBESON, KRAUSE, ET AL. (1982) |

TABLE 2: RECOMMENDED (CONTINUED)

| NAME | FACTOR | DOMAIN | ADMIN TIME (MIN) | TYPE ADMIN | TOT STAB TIME IN MINUTES (DIFF) | RELIAB EFFIC 3 MIN | REFERENCES |
|---|---|---|---|---|---|---|---|
| MANIKIN TEST: LOG. LATENCY | SPATIAL TRANSFORMATION (EGAN, 1978) | P | 7 | I | 14(14) | 0.79 | CARTER & WOLDSTAD (IN PRESS); READER, BENEL, & RAHE (1981) |
| MINNESOTA RATE OF MANIPULATION: TURNING | MANUAL DEXTERITY (FLEISHMAN & ELLISON, 1962) | M | 2-4 | I | 10(10) | 0.64 | CARTER, STONE, & BITTNER (1982); SCHOENFELDT (1972) |
| PATTERN COMPARISON: NUMBER CORRECT MINUS NUMBER INCORRECT | SPATIAL ABILITY (KLEIN & ARMITAGE, 1979) | P | 2 | G | 18(18) | 0.93 | SHANNON, CARTER, & BOUDREAU (IN PRESS); KLEIN & ARMITAGE (1979); CARTER & SBISA (1982) |
| PERCEPTUAL SPEED | PERCEPTUAL SPEED (PS) (EKSTROM ET AL., 1976) | P | 2.5 | G | 23(15) | 0.86 | BITTNER ET AL. (1983); MORAN & MEFFERD (1959) |
| SEARCH FOR TYPOS IN PROSE: MEDIAN DETECTION TIME | READING SPEED | P | 6 | I | 54(54) | 0.65 | SHANNON ET AL. (IN PRESS); CARTER & KRAUSE (1983) |
| SPOKE CONTROL (C) TASK | SPEED ARM MOVEMENT (FLEISHMAN & ELLISON, 1962) | M | 0.67 APPROX | G | 1(1) | 0.95 | BITTNER, LUNDY, KENNEDY, & HARBESON (1982) |
| STERNBERG ITEM RECOGNITION: POSITIVE SET 1 | SHORT-TERM MEMORY SCAN (STERNBERG, 1966, 1975) | C | 3 | I | 18(18) | 0.70 | CARTER, KENNEDY, BITTNER, & KRAUSE (1980); STERNBERG (1969, 1975) |
| STERNBERG ITEM RECOGNITION: POSITIVE SET 4 | SHORT-TERM MEMORY SCAN (STERNBERG, 1966, 1975) | C | 3 | I | 15(9) | 0.80 | CARTER ET AL. (1980); CARTER & KRAUSE (1983); STERNBERG (1969, 1975) |
| STROOP: COLOR WORDS (CW) | MIXED | C, P | 0.5 | G | 1.5(1.5) | 0.97 | HARBESON, KRAUSE, KENNEDY, & BITTNER (1982) |
| TRACKING: CRITICAL | TRACKING, CRITICAL (JEX ET AL., 1966) | P, M | 1 | I | 100(100) | 0.60 | DAMOS ET AL. (1984); JEX ET AL. (1966) |
| TRACKING: DUAL CRITICAL | TRACKING, CRITICAL & DUAL FACTOR? (DAMOS, BITTNER, KENNEDY, & HARBESON, 1981) | P, M | 1 | I | 100(100) | 0.50 | DAMOS ET AL. (1981) |

VISUAL 

CONTRAST  SENSI¬ 

P 

3 

1 

<1  ( <1 ) 

0.51 

GINSBURG,  BITTNER,  KENNEDY 

CONTRAST 

TIVITY  FUNCTION: 

P 

3 

I 

<1( <1 ) 

0.52 

HARBESON  (1983);  GINSBURG 

SENSITIVITY: 

1 ,  2,  4,  8,  16  cpd 

P 

3 

I 

<1  ( <  1 ) 

0.74 

4  EVANS  (1982) 

METHOD  OF 

(GINSBURG  4  EVANS, 

P 

3 

I 

<1  ( <1 ) 

0.75 

INCREASING 

CONTRAST 

1982) 

P 

3 

I 

<  1  ( <  1 ) 

0.53 

WORO  FLUENCY 

WORD  aUENCY  (FW) 
(EKSTROM  ET  AL., 

C 

3 

G 

<1(<1) 

0.79 

CARTER,  CURLEY,  4  STYER 
(IN  REVIEW) 

1976) 


TABLE  2:  RECOMMENDED 


NAME | FACTOR | DOMAIN | ADMIN TIME (MIN) | TYPE ADMIN | TOT STAB TIME IN MINUTES (DIFF) | RELIAB EFFIC (3 MIN) | REFERENCES
AIMING | AIMING: FINE EYE-HAND COORDINATION (FLEISHMAN & ELLISON, 1962) | P, M | 2 | G | 30(30) | 0.87 | KRAUSE & WOLDSTAD (1983); FLEISHMAN & ELLISON (1962)
ARITHMETIC: VERTICAL ADDITION | NUMBER FACILITY (N) (EKSTROM, FRENCH, HARMON, & DERMEN, 1976) | C | 4 | G | 48(8) | 0.90 | BITTNER, CARTER, KRAUSE, KENNEDY, & HARBESON (1983); CARTER & SBISA (1982)
ASSOCIATIVE MEMORY: NUMBER CORR: LIST 1 | ASSOCIATIVE MEMORY (MA) (EKSTROM ET AL., 1976) | C | 2.5 | G | 20(20) | 0.65 | KRAUSE & KENNEDY (1980); CARTER & KRAUSE (1983); UNDERWOOD, BORUCH, & MALMI (1977)
ATARI* AIR COMBAT MANEUVERING | PURSUIT TRACKING (KENNEDY, BITTNER, & JONES, 1981) | P, M | 2.25 | I | 135(135) | 0.63 | JONES, KENNEDY, & BITTNER (1981); KENNEDY, BITTNER, HARBESON, & JONES (1982)
ATARI* ANTIAIRCRAFT | UNKNOWN | P, M | 2.25 | I | 126(126) | 0.67 | JONES & KENNEDY (1983) WITH ADAPTATIONS
CHOICE REACTION TIME: 1-CHOICE | SIMPLE REACTION TIME (DONDERS, 1868) | P | 5.0 | I | 35(35) | 0.58 | KRAUSE & BITTNER (1982); TEICHNER & KREBS (1974)
CHOICE REACTION TIME: 4-CHOICE | CHOICE REACTION TIME (DONDERS, 1868) | P | 5.0 | I | 50(50) | 0.80 | KRAUSE & BITTNER (1982); TEICHNER & KREBS (1974)
CODE SUBSTITUTION | MEMORY ASSOC. (MA), PERCEPTUAL SPEED (P) (EKSTROM ET AL., 1976) | C, P | 2.0 | G | 16(16) | 0.84 | PEPPER, KENNEDY, BITTNER, & WIKER (1980); WECHSLER (1981)
FLEXIBILITY OF CLOSURE | CLOSURE, FLEXIBILITY OF (CF) (EKSTROM ET AL., 1976) | P | 3 | G | 9(9) | 0.88 | BITTNER ET AL. (1983); MORAN & MEFFERD (1959)
GRAMMATICAL REASONING | REASONING, LOGICAL (RL) (EKSTROM ET AL., 1976) | C | 1.5 | G | 18(18) | 0.93 | BITTNER ET AL. (1983); CARTER, KENNEDY, & BITTNER (1981); BADDELEY (1968)
GRAPHEMIC AND PHONEMIC ANALYSIS: SENSE/NONSENSE | READING SPEED (BARON & MCKILLOP, 1975) | C | 8 | G | 16(16) | 0.66 | HARBESON, KENNEDY, KRAUSE, & BITTNER (1982); BARON & MCKILLOP (1973); ROSE & FERNANDES (1977)
LETTER CLASSIFICATION: NAME | RETRIEVAL FROM LTM & MATCHING (POSNER & MITCHELL, 1967) | C | 12 | G | 84(84) | 0.55 | HARBESON, KENNEDY, ET AL. (1982); POSNER & MITCHELL (1967); ROSE & FERNANDES (1977)
LETTER CLASSIFICATION: CATEGORY | RETRIEVAL FROM LTM & MATCHING (POSNER & MITCHELL, 1967) | C | 11 | G | 121(121) | 0.69 | HARBESON, KENNEDY, ET AL. (1982); POSNER & MITCHELL (1967); ROSE & FERNANDES (1977)

Continued on next page.
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TABLE  1.  DEFINITIONS  OF  TASK  FEATURES 


FEATURE (Abbreviations used in tables) | DEFINITION
NAME | Name of the task or measure as used in the literature.
FACTOR | The factor(s) assessed by the measure as identified in the literature or by judgments of the authors.
DOMAIN | Characterization of the domain(s) of assessment of the capability as cognitive, perceptual (including sensory), or motor.
ADMINISTRATION TIME IN MINUTES (ADMIN TIME) | The typical testing time for a measure; this includes all testing time required to obtain a score (e.g., components of a derived score).
TYPE OF ADMINISTRATION (TYPE ADMIN) | Identification of task as individually (I) or group (G) administered.
TOTAL STABILIZATION TIME IN MINUTES (DIFFERENTIAL) | The total stabilization time is the amount of elapsed experimental time (whether massed or distributed) required for mean, variance, and differential (correlational) stabilization. (The amount of elapsed practice time required for Differential Stabilization alone is in parentheses.)
RELIABILITY EFFICIENCY (3 minutes) | The differentially stabilized reliability normalized to a 3 minute administration. Normalization to 3 minutes was by the Spearman-Brown Equation (Bittner & Carter, 1981; Winer, 1971).
REFERENCES | Cited in order are the relevant stability study, the original source of the measure, and occasionally other significant references.
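
The normalization described under RELIABILITY EFFICIENCY can be illustrated with a short sketch. It assumes the general Spearman-Brown form, in which a reliability r observed for a test of a given length is projected to a test k times as long as k*r / (1 + (k - 1)*r), with k taken here as the ratio of the 3 minute target to the administration time. The function name and the example values are hypothetical and are not taken from the tables.

```python
def spearman_brown(reliability: float, admin_minutes: float,
                   target_minutes: float = 3.0) -> float:
    """Project a reliability observed at admin_minutes to target_minutes.

    Uses the general Spearman-Brown form with length ratio
    k = target_minutes / admin_minutes (an assumption of this sketch).
    """
    if admin_minutes <= 0 or target_minutes <= 0:
        raise ValueError("times must be positive")
    k = target_minutes / admin_minutes
    return k * reliability / (1.0 + (k - 1.0) * reliability)


if __name__ == "__main__":
    # Hypothetical measure: reliability 0.80 from a 6 minute administration.
    # Shortening to 3 minutes (k = 0.5) lowers the projected reliability.
    print(round(spearman_brown(0.80, 6.0), 2))
```

Under this form a measure that already takes 3 minutes is unchanged (k = 1), shorter measures are projected upward, and longer measures are projected downward, which is what allows measures of different lengths to be compared on one scale.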
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Environmental  Research  (PETER):  Navigational  plotting.  Aviat.  Space 
Environ.  Med.  1983;  54:144-149. 
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employed  in  a  substantial  number  of  environmental  investigations  (5,42,75,76, 
107).  Critical  tracking  (55)  has  also  been  extensively  employed  (30). 
Practice  levels  routinely  recommended  by  Jex  (personal  communication,  1983) 
exceed  those  recommended  in  this  report  by  60%.  Investigations  which  have 
appropriately  used  stable  measures  provide  a  firm  foundation  for  the 
understanding of environmental effects.

CONCLUSIONS  AND  RECOMMENDATIONS 


Four conclusions and associated recommendations emerged from the present
research:


1.  The  tables  presented  in  this  report  are  a  guide  for  the  selection  and 
application  of  performance  tests  in  environmental  research. 

2.  Difference,  slope,  and  ratio  scores  frequently  possess  undesirable 
psychometric  properties,  and  their  cautious  empirical  examination  is 
recommended  before  application. 

3.  The  literature  on  performance  changes  due  to  environmental  factors 
should  be  reviewed  in  terms  of  stability  or  instability  of  measurements. 

4.  The  evaluation  of  the  psychometric  stability  of  performance  measures 
under  baseline  conditions  provides  a  foundation  for  environmental  research 
applications  using  repeated  measurements. 
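
As an illustration of what examining stability under baseline conditions involves, the following sketch summarizes the intertrial means, variances, and adjacent-trial correlations of a small score matrix. It is a minimal example under stated assumptions: the data are hypothetical, the function names are invented for illustration, and no particular decision rule for declaring a measure stable is implied.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def intertrial_summary(scores):
    """scores[subject][trial] -> (trial means, trial variances, adjacent-trial r)."""
    trials = list(zip(*scores))  # one tuple of subject scores per trial
    means = [mean(t) for t in trials]
    variances = [variance(t) for t in trials]
    adjacent_r = [pearson(trials[i], trials[i + 1])
                  for i in range(len(trials) - 1)]
    return means, variances, adjacent_r


if __name__ == "__main__":
    # Hypothetical baseline data: 4 subjects by 3 trials.
    data = [[10, 12, 12], [14, 16, 16], [9, 11, 11], [13, 15, 15]]
    means, variances, adjacent_r = intertrial_summary(data)
    print(means, variances, adjacent_r)
```

In these hypothetical data the mean rises from the first to the second trial and is then flat, while the variances and adjacent-trial correlations are constant throughout. A pattern of that kind would suggest that the mean needs more practice to settle than the correlations do.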
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VARIANCE 

COLLINS & QUILLIAN (1969);

PROPERTY, 0-

QUILLIAN, 1969)
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VARIANCE 

COLLINS & QUILLIAN (1969);

PROPERTY, 1ST

QUILLIAN, 1969)

(11.69)

KENNEDY & HARBESON (IN
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MORAN  4  MEFFERO  (1959) 

ET  AL.,  1976) 

VARIANCE 

(18) 

WONDERLIC PERSONNEL TEST | GENERAL INTELLIGENCE (WONDERLIC, 1978) | C | 12 | G | 48(<12) | 0.34 | MACKAMAN, BITTNER, HARBESON, KENNEDY, & STONE (1982); WONDERLIC (1978)

TABLE  5.  UNACCEPTABLE 


NAME | FACTOR | DOMAIN | ADMIN TIME (MIN) | TYPE ADMIN | TOT STAB TIME IN MINUTES (DIFF) | RELIAB EFFIC (3 MIN) | REFERENCES
ATARI* BREAKOUT | SLOWLY CHANGING & UNKNOWN | P, M | 2.00 APPROX | I | VARIANCE (UNSTABLE) | 0.41 APPROX | KENNEDY ET AL. (1982)
ATARI* ICE RACE | UNKNOWN | P, M | 1.00 | I | UNSTABLE (UNSTABLE) | 0.38 APPROX | JONES & KENNEDY (1983) WITH ADAPTATIONS
AUDITORY DIGIT SPAN (BACKWARDS) | MEMORY SPAN (MS) (EKSTROM ET AL., 1976) | C | 15 | G | UNSTABLE (UNSTABLE) | 0.24 APPROX | EKSTROM ET AL. (1976); MCCAFFERTY ET AL. (1980)
GRAPHEMIC AND PHONEMIC ANALYSIS: SH/HN RATIO | RELATIVE VISUAL/ACOUSTIC ENCODING (BARON & MCKILLOP, 1975) | C | 16 | G | UNSTABLE (UNSTABLE) | 0.00 APPROX | HARBESON, KENNEDY, ET AL. (1982); BARON (1973); BARON & MCKILLOP (1975); ROSE & FERNANDES (1977)
GRAPHEMIC AND PHONEMIC ANALYSIS: % ERRORS | MIXED (ROSE & FERNANDES, 1977) | C | 24 | G | 192(192) | 0.12 | HARBESON, KENNEDY, ET AL. (1982); BARON (1973); BARON & MCKILLOP (1975); ROSE & FERNANDES (1977)
INTERFERENCE SUSCEPTIBILITY SLOPE ACROSS LISTS | PROACTIVE INTERFERENCE SUSCEPTIBILITY (UNDERWOOD ET AL., 1977) | C | 10 | G | UNSTABLE (UNSTABLE) | 0.03 APPROX | CARTER & KRAUSE (1983); UNDERWOOD ET AL. (1977)
LETTER CLASSIFICATION: N - P | NAME SEARCH TIME (POSNER & MITCHELL, 1967) | C | 24 | G | 216(216) | 0.02 | HARBESON, KENNEDY, ET AL. (1982); POSNER & MITCHELL (1967); ROSE & FERNANDES (1977)
LETTER CLASSIFICATION: C - N | CATEGORY SEARCH TIME (POSNER & MITCHELL, 1967) | C | 23 | G | 253(253) | 0.10 | HARBESON, KENNEDY, ET AL. (1982); POSNER & MITCHELL (1967); ROSE & FERNANDES (1977)
LEXICAL DECISION MAKING: GRAPHEMIC AND PHONEMIC FACILITATION | READING SPEED (MEYER, SCHVANEVELDT, & RUDDY, 1974) | C | 3 | G | UNSTABLE (UNSTABLE) | 0.00 APPROX | KENNEDY & HARBESON (IN PRESS); MEYER ET AL. (1974); ROSE & FERNANDES (1977)
LEXICAL DECISION MAKING: GRAPHEMIC INTERFERENCE | ACOUSTIC OR PHONEMIC ENCODING (MEYER ET AL., 1974) | C | 3 | G | UNSTABLE (UNSTABLE) | 0.00 APPROX | KENNEDY & HARBESON (IN PRESS); MEYER ET AL. (1974); ROSE & FERNANDES (1977)
LEXICAL DECISION MAKING: PHONEMIC SIMILARITY | VISUAL OR GRAPHEMIC ENCODING (MEYER ET AL., 1974) | C | 3 | G | UNSTABLE (37) | 0.27 APPROX | KENNEDY & HARBESON (IN PRESS); MEYER ET AL. (1974); ROSE & FERNANDES (1977)
MAZE TRACING | SPATIAL SCANNING (SS) (EKSTROM ET AL., 1976) | P | 2 | G | NOT EQUIVALENT | INESTIMABLE | KRAUSE & WOLDSTAD (1983); SHANNON (1982)
NAVIGATIONAL PLOTTING: PERCENT CORRECT | MIXED (WIKER ET AL., 1983) | C, P, M | 9 | G | UNSTABLE (UNSTABLE) | INESTIMABLE | WIKER ET AL. (1983)

Continued on next page.
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TABLE  5.  UNACCEPTABLE  (CONTINUED) 


NAME | FACTOR | DOMAIN | ADMIN TIME (MIN) | TYPE ADMIN | TOT STAB TIME IN MINUTES (DIFF) | RELIAB EFFIC (3 MIN) | REFERENCES
RUNNING RECOGNITION: NUMBER CORRECT | RECOGNITION FROM SHORT TERM MEMORY (UNDERWOOD ET AL., 1977) | C | 4 APPROX | G | UNSTABLE (UNSTABLE) | INESTIMABLE | HARBESON ET AL. (1980); FERNANDES & ROSE (1978); UNDERWOOD ET AL. (1977)
SEMANTIC MEMORY RETRIEVAL: PROPERTY SLOPE | LTM SCANNING (ROSE & FERNANDES, 1977) | C | 3.34 | G | UNSTABLE VARIANCES (16.77) | 0.00 APPROX | CARTER & KRAUSE (1983); COLLINS & QUILLIAN (1969); KENNEDY & HARBESON (IN PRESS); ROSE & FERNANDES (1977)
SEMANTIC MEMORY RETRIEVAL: SUPERSET SLOPE | LTM SCANNING (ROSE & FERNANDES, 1977) | C | 3.34 | G | UNSTABLE VARIANCES (36.7) | 0.00 APPROX | CARTER & KRAUSE (1983); COLLINS & QUILLIAN (1969); KENNEDY & HARBESON (IN PRESS); ROSE & FERNANDES (1977)
STERNBERG ITEM RECOGNITION: INTERCEPT | STIMULUS PROCESSING & RESPONSE FORMATION TIME (STERNBERG, 1966, 1975) | P, M | 12 (4 SET SIZES) | I | UNSTABLE (UNSTABLE) | 0.00 APPROX | CARTER ET AL. (1980); CARTER & KRAUSE (1983); STERNBERG (1966, 1975)
STERNBERG ITEM RECOGNITION: SLOPE | SHORT-TERM MEMORY SCAN RATE (STERNBERG, 1966, 1975) | C | 12 (4 SET SIZES) | I | UNSTABLE (UNSTABLE) | 0.11 APPROX | CARTER ET AL. (1980); CARTER & KRAUSE (1983); STERNBERG (1966, 1975)
TIME ESTIMATION: VARIABLE ERROR | PRODUCTION TIME JUDGEMENT (VROON, 1976) | C, P | 15 (5 REP 8 INTERVALS) | I | UNSTABLE (UNSTABLE) | 0.35 APPROX | MCCAULEY ET AL. (1980); ZELKIND & SPRUNG (1974)
TRACKING: DUAL CRITICAL-TWO DIMENSIONAL COMPENSATORY | MIXED | P, M | 1 | I | UNSTABLE (UNSTABLE) | 0.00 APPROX | KENNEDY ET AL. (1981); DAMOS ET AL. (1981)
VISUAL CONTRAST SENSITIVITY: METHOD OF ADJUSTMENT | CONTRAST SENSITIVITY FUNCTION: 1, 2, 4, 8, 16 cpd (GINSBURG & EVANS, 1982) | P | 15 (EACH cpd) | I | UNSTABLE (UNSTABLE) | VARIED | GINSBURG ET AL. (1983); GINSBURG & EVANS (1982)
VISUAL CONTRAST SENSITIVITY: BEKESY METHOD | CONTRAST SENSITIVITY FUNCTION: 1, 2, 4, 8, 16 cpd (GINSBURG & EVANS, 1982) | P | 15 (EACH cpd) | I | UNSTABLE (UNSTABLE) | VARIED | GINSBURG ET AL. (1983); GINSBURG & EVANS (1982)
VISUAL RESOLUTION ACUITY: ERRORS | VISUAL ACUITY & PERCEPTUAL SPEED | P | 1 | I | INESTIMABLE | INESTIMABLE | GUIGNARD, BITTNER, EINBENDER, & KENNEDY (1980); GUIGNARD, LANDRUM, & REARDON (1976)
VISUAL RESOLUTION ACUITY: TIME | VISUAL ACUITY & PERCEPTUAL SPEED | P | 1 | I | INESTIMABLE | INESTIMABLE | GUIGNARD ET AL. (1980); GUIGNARD ET AL. (1976)
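
Several of the Unacceptable measures are difference or slope scores (for example, the N - P and C - N letter classification scores). One common account of why such scores tend to be unreliable comes from classical test theory. For two component scores with equal variances, the reliability of their difference is ((r_xx + r_yy)/2 - r_xy) / (1 - r_xy), where r_xx and r_yy are the component reliabilities and r_xy is the correlation between the components. The sketch below applies that textbook formula. It is not a procedure taken from this report, the function name is invented for illustration, and the input values are hypothetical.

```python
def difference_score_reliability(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Reliability of X - Y under classical test theory, equal variances assumed.

    r_xx, r_yy: reliabilities of the two component scores.
    r_xy: correlation between the two component scores.
    """
    if r_xy >= 1.0:
        raise ValueError("r_xy must be less than 1")
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


if __name__ == "__main__":
    # Hypothetical components: each reliable at 0.80, correlated 0.70.
    print(round(difference_score_reliability(0.80, 0.80, 0.70), 2))
```

With these hypothetical inputs the difference score retains a reliability of only about one third even though each component is reliable at 0.80. The more highly the two components correlate, the less reliable variance is left in their difference.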


TABLE  6.  MINI-BATTERY  FOR  ENVIRONMENTAL  RESEARCH 


NAME | RATIONALE FOR INCLUSION | ADMINISTRATION TIME IN MINUTES | RELIABILITY EFFICIENCY (3 MINUTES)
GRAMMATICAL REASONING | Assesses an analytic cognitive neuropsychological function associated with the left hemisphere. | 1.5 | 0.93
PATTERN COMPARISON | Assesses an integrative spatial function neuropsychologically associated with the right hemisphere. | 2.0 | 0.93
CODE SUBSTITUTION | This is a mixed associative memory-perceptual speed task which provides for a traditional assessment of these components not otherwise covered by other measures. | 2.0 | 0.84
AIMING | Directly provides for the assessment of environmental effects on fine eye-hand coordination and indirectly provides for separation of such effects from other cognitive measures. | 2.0 | 0.87
SPOKE CONTROL (C) TASK | Directly assesses arm movement speed and indirectly provides for distinction of gross environmental disruptions from disruptions in fine eye-hand coordination and cognition. | <1.0 | 0.95
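
The time budget implied by Table 6 can be tallied with a few lines of code. The sketch below copies the administration times and reliability efficiencies from the table. The one assumption is that the Spoke Control entry, listed as <1.0, is entered as 1.0, so the total is an upper bound on testing time and excludes instructions and changeover between tasks.

```python
# (name, administration time in minutes, reliability efficiency at 3 minutes)
MINI_BATTERY = [
    ("GRAMMATICAL REASONING", 1.5, 0.93),
    ("PATTERN COMPARISON", 2.0, 0.93),
    ("CODE SUBSTITUTION", 2.0, 0.84),
    ("AIMING", 2.0, 0.87),
    ("SPOKE CONTROL (C) TASK", 1.0, 0.95),  # listed as <1.0; 1.0 is an assumed upper bound
]


def total_admin_minutes(battery):
    """Sum of per-measure administration times (upper bound on testing time)."""
    return sum(minutes for _, minutes, _ in battery)


def least_reliable(battery):
    """Measure with the lowest reliability efficiency in the battery."""
    return min(battery, key=lambda m: m[2])


if __name__ == "__main__":
    print(total_admin_minutes(MINI_BATTERY))
    print(least_reliable(MINI_BATTERY)[0])
```

Under that assumption the five measures need at most 8.5 minutes of testing time per administration, and the lowest reliability efficiency in the battery is the 0.84 listed for Code Substitution.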



